Mapping the Paraphrase Database to WordNet
Abstract
WordNet has facilitated important research in natural language processing but its usefulness is somewhat limited by its relatively small lexical coverage. The Paraphrase Database (PPDB) covers 650 times more words, but lacks the semantic structure of WordNet that would make it more directly useful for downstream tasks. We present a method for mapping words from PPDB to WordNet synsets with 89% accuracy. The mapping also lays important groundwork for incorporating WordNet’s relations into PPDB so as to increase its utility for semantic reasoning in applications.
Related resources
ASE@DPIL-FIRE2016: Hindi Paraphrase Detection using Natural Language Processing Techniques & Semantic Similarity Computations
The paper reports the approaches utilized and results achieved for our system in the shared task (in FIRE-2016) for paraphrase identification in Indian languages (DPIL). Since Indian languages have a complex inherent nature, paraphrase identification in these languages becomes a challenging task. In the DPIL task, the challenge is to detect and identify whether a given sentence pairs paraphrase...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
An Algorithm to Find Words from Definitions
This paper presents a system to find automatically words from a definition or a paraphrase. The system uses a lexical database of French words that is comparable in its size to WordNet and an algorithm that evaluates distances in the semantic graph between hypernyms and hyponyms of the words in the definition. The paper first outlines the structure of the lexical network on which the method is ...
Learning Paraphrase Models from Google New Headlines
Data sources like the clusters of news headlines at Google News present an exciting opportunity to learn paraphrase models from data automatically. We present both a novel dataset and a novel approach to automatic, unsupervised learning of paraphrase models from that dataset. Leveraging existing NLP tools such as the Stanford Parser and lexical resources such as WordNet and Infomap, we construct...
A Lexical Database and an Algorithm to Find Words from Definitions
This paper presents a system to find automatically words from a definition or a paraphrase. The system uses a lexical database of French words that is comparable in its size to WordNet and an algorithm that evaluates distances in the semantic graph between hypernyms and hyponyms of the words in the definition. The paper first outlines the structure of the lexical network on which the method is ...
Publication date: 2017